Estimators of Variance for K-Fold Cross-Validation
Abstract
1 Motivations

In machine learning, the standard measure of accuracy for a model is the prediction error (PE), i.e. the expected loss on future examples. We consider here the i.i.d. regression or classification setups, where future examples are assumed to be independently sampled from the distribution that generated the training set. When the data distribution is unknown, PE cannot be computed; it is instead estimated, both to select a particular model within a learning algorithm and to predict its performance. In experiments aiming at comparing learning algorithms, the expected prediction error EPE (where the expectation is taken over training sets) may be considered a more relevant criterion [2]. If the amount of data is large enough, PE can be estimated by the mean error over a hold-out test set. The usual variance estimates for means of independent samples can then be computed to derive error bars on the estimated prediction error, and to assess the statistical significance of differences between models. Note, however, that these procedures do not aim at estimating the variance of estimates of EPE. The hold-out technique makes inefficient use of data, which forbids its application to small sample sizes. In that situation, one resorts to computer-intensive resampling methods such as cross-validation or the bootstrap to estimate EPE. We focus on estimating the variance of the K-fold cross-validation estimate of EPE. While it is known that cross-validation provides an unbiased estimate of EPE, it is also known that its variance may be very large [1]. An accurate estimate of this variance would provide the means to build confidence intervals on EPE, and to derive statistics for testing the significance of observed differences between algorithms.
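To make the quantities concrete, here is a minimal sketch, assuming an illustrative dataset, model, and K (none prescribed by the text), that computes the K-fold cross-validation estimate of EPE together with the naive error bar obtained by treating the K fold errors as independent samples; the validity of exactly this kind of error bar is what is at issue:

```python
# Minimal sketch: K-fold CV estimate of EPE with a naive error bar.
# Dataset, model, and K are illustrative assumptions, not from the text.
import numpy as np
from sklearn.datasets import make_regression
from sklearn.linear_model import Ridge
from sklearn.metrics import mean_squared_error
from sklearn.model_selection import KFold

X, y = make_regression(n_samples=200, n_features=10, noise=1.0, random_state=0)

K = 5
fold_errors = []
for train_idx, test_idx in KFold(n_splits=K, shuffle=True, random_state=0).split(X):
    model = Ridge().fit(X[train_idx], y[train_idx])
    fold_errors.append(mean_squared_error(y[test_idx], model.predict(X[test_idx])))

cv_estimate = np.mean(fold_errors)  # unbiased estimate of EPE
# Naive standard error: treats the K fold errors as independent, which they
# are not (folds share training examples); hence the need for better estimators.
naive_se = np.std(fold_errors, ddof=1) / np.sqrt(K)
print(f"EPE estimate: {cv_estimate:.3f} +/- {naive_se:.3f}")
```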
Similar Papers
No Unbiased Estimator of the Variance of K-Fold Cross-Validation
Most machine learning researchers perform quantitative experiments to estimate generalization error and compare algorithm performances. In order to draw statistically convincing conclusions, it is important to estimate the uncertainty of such estimates. This paper studies the estimation of uncertainty around the K-fold cross-validation estimator. The main theorem shows that there exists no univ...
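For reference, the obstruction can be sketched with the variance decomposition that this line of work builds on; the notation below (n examples, K folds of size m = n/K, per-example losses e_i) is an adaptation, not a verbatim quotation:

```latex
% Sketch (notation adapted, not quoted): the CV estimate is the grand mean of
% the n per-example losses e_i, whose correlations within and across test
% folds yield three variance components.
\[
  \operatorname{Var}\!\left[\widehat{\mathrm{CV}}\right]
  = \frac{1}{n}\,\sigma^{2}
  + \frac{m-1}{n}\,\omega
  + \frac{n-m}{n}\,\gamma ,
\]
% where sigma^2 = Var[e_i], omega = Cov[e_i, e_j] for i, j in the same test
% fold, and gamma = Cov[e_i, e_j] for i, j in different folds. Informally, a
% single CV run gives no way to disentangle these three components, which is
% what rules out a universal unbiased estimator of the left-hand side.
```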
Long-term Streamflow Forecasting by Adaptive Neuro-Fuzzy Inference System Using K-fold Cross-validation: (Case Study: Taleghan Basin, Iran)
Streamflow forecasting has an important role in water resource management (e.g. flood control, drought management, reservoir design, etc.). In this paper, the Adaptive Neuro-Fuzzy Inference System (ANFIS) is applied to long-term (monthly, seasonal) streamflow forecasting, and K-fold cross-validation is investigated as a means of evaluating the test-training split in the model. Then,...
Mean squared error of prediction (MSEP) estimates for principal component regression (PCR) and partial least squares regression (PLSR)
The paper presents results from simulations based on real data, comparing several competing mean squared error of prediction (MSEP) estimators on principal components regression (PCR) and partial least squares regression (PLSR): leave-one-out cross-validation, K-fold and adjusted K-fold cross-validation, the ordinary bootstrap estimate, the bootstrap smoothed cross-validation (BCV) estimate and ...
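As an illustration only (not the paper's simulation protocol), two of the listed MSEP estimators can be computed for a PCR model with scikit-learn; the dataset, number of components, and K below are arbitrary assumptions:

```python
# Sketch: leave-one-out and K-fold MSEP estimates for principal component
# regression (PCR). All settings here are illustrative assumptions.
import numpy as np
from sklearn.datasets import make_regression
from sklearn.decomposition import PCA
from sklearn.linear_model import LinearRegression
from sklearn.model_selection import KFold, LeaveOneOut, cross_val_score
from sklearn.pipeline import make_pipeline

X, y = make_regression(n_samples=60, n_features=20, noise=2.0, random_state=1)
pcr = make_pipeline(PCA(n_components=5), LinearRegression())  # PCR as a pipeline

loo_msep = -cross_val_score(pcr, X, y, cv=LeaveOneOut(),
                            scoring="neg_mean_squared_error").mean()
kfold_msep = -cross_val_score(
    pcr, X, y, cv=KFold(n_splits=10, shuffle=True, random_state=1),
    scoring="neg_mean_squared_error").mean()
print(f"LOO MSEP:     {loo_msep:.3f}")
print(f"10-fold MSEP: {kfold_msep:.3f}")
```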
Efficient Computation of Unconditional Error Rate Estimators for Learning Algorithms and an Application to a Biomedical Data Set (Master Thesis)
We derive an unbiased variance estimator for re-sampling procedures using the fact that those procedures are incomplete U-statistics. Our approach is based on careful examination of the combinatorics governing the covariances between re-sampling iterations. We establish such an unbiased variance estimator for the special case of K-Fold cross-validation. This estimator exists as soon as new obse...
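The starting point of such combinatorial bookkeeping is the standard expansion of the variance of the averaged fold errors, in which the covariances between re-sampling iterations appear explicitly (a general identity, not the thesis' estimator itself):

```latex
% Variance of the average of K fold errors \hat{e}_k; the cross terms are the
% covariances induced by folds sharing training examples.
\[
  \operatorname{Var}\!\left[\frac{1}{K}\sum_{k=1}^{K}\hat{e}_{k}\right]
  = \frac{1}{K^{2}}\left(
      \sum_{k=1}^{K}\operatorname{Var}\!\left[\hat{e}_{k}\right]
    + \sum_{k\neq l}\operatorname{Cov}\!\left[\hat{e}_{k},\hat{e}_{l}\right]
    \right).
\]
```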
Cross-Validation and Mean-Square Stability
k-fold cross-validation is a popular practical method to get a good estimate of the error rate of a learning algorithm. Here, the set of examples is first partitioned into k equal-sized folds. Each fold acts as a test set for evaluating the hypothesis learned on the other k − 1 folds. The average error across the k hypotheses is used as an estimate of the error rate. Although widely used, espec...
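A from-scratch sketch of exactly this procedure, with a placeholder learner and error function (the names kfold_error, fit, and error are illustrative, not from the paper):

```python
# Partition the examples into k equal-sized folds, evaluate the hypothesis
# learned on the other k-1 folds on each held-out fold, and average.
import numpy as np

def kfold_error(X, y, fit, error, k=5, seed=0):
    """Average held-out error over k folds; `fit(X, y)` returns a predictor."""
    idx = np.random.default_rng(seed).permutation(len(X))
    folds = np.array_split(idx, k)  # k (nearly) equal-sized folds
    errs = []
    for i in range(k):
        test = folds[i]
        train = np.concatenate([folds[j] for j in range(k) if j != i])
        predict = fit(X[train], y[train])  # hypothesis learned on k-1 folds
        errs.append(error(y[test], predict(X[test])))
    return float(np.mean(errs))  # average error across the k hypotheses

# Usage with a trivial nearest-mean classifier as the placeholder learner:
rng = np.random.default_rng(1)
X = rng.normal(size=(100, 2))
y = (X[:, 0] > 0).astype(int)

def fit(Xtr, ytr):
    means = {c: Xtr[ytr == c].mean(axis=0) for c in (0, 1)}
    return lambda Xt: np.array(
        [min(means, key=lambda c: np.linalg.norm(x - means[c])) for x in Xt])

error = lambda y_true, y_pred: float(np.mean(y_true != y_pred))
print(f"5-fold error rate: {kfold_error(X, y, fit, error):.3f}")
```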